
Spark Tasking v3

This document describes the third iteration on designing deterministic task assignment to Spark checker nodes. Prior work:

Motivation

At the moment, checker nodes can choose their tasks in whatever way they want. This allows SPs to collude with checker operators, e.g. by checking only their own deals.

The v3 design makes the task→node assignment deterministic, so nodes can no longer choose their tasks.

This improvement is a prerequisite for further protocol improvements:

  • Ensure most committees are large enough to give us confidence in majority being honest
  • Group measurements by the task (the committee), find & accept the majority, reject minority results.

Requirements

  1. Make Sybil attacks difficult/expensive
    We want to limit how many checker instances a single party operates by disincentivising operating too many instances.
  2. Make SP collusion difficult
    We want to make it difficult to run colluding SPARK nodes that limit the retrieval checks to a given SP or perhaps a small fixed set of CIDs.
    Essentially, the checker nodes must not be able to self-deal (choose the tasks to perform).
  3. (Mostly) uniform committee sizes
    Our ā€œProof of Retrievalā€ solution requires a committee in which an honest majority can be formed.
    At the end of each round, we want all committees to have roughly the same size. The size must be large enough to give us confidence that the majority is honest. (A committee is the list of nodes that measured a given task - the pair (CID, minerId).)
    šŸ¤” One thing to take into account is that an honest majority is not always enough, given that checkers can experience false positives due to network issues etc.
  4. No membership service
    The algorithm must assign tasks to checker nodes in a probabilistic manner that does not require us to maintain a list of online nodes (a membership service).
  5. Rate limiting
    We want to rate-limit how many jobs are performed by each checker.
    • Without rate limiting & fraud detection, a fraudulent node can short-circuit job execution by skipping the retrieval and reporting fake results instead. This allows nodes to cheaply report many completed jobs, which is later translated into a disproportionately large impact & reward.
    • Even with fraud detection in place, if we don’t rate-limit the checks, then nodes on fast, unmetered internet connections can tweak the SPARK checker code to perform more checks than we designed for. This can put too much pressure on SPs. Plus, the operator will get a larger portion of rewards, which is unfair to honest nodes.
    • We want fairness - i.e. guarantees that no party can amplify their fraction of the total reports. Otherwise, even if I control only 1% of the nodes, if those are all ā€œfastā€ nodes, I could still potentially submit 50% of the total reports - so this is about Sybils and scarce resources.
    • We must avoid DDoSing SPs and spread the load across all SPs. If the entire SPARK network checks the same (CID, SP) pair, we will effectively DDoS the poor SP.
  6. IPv4 as a scarce resource
    The algorithm must disincentivise operators running more than one node in a given IPv4 /24 subnet.
  7. Cheap to verify
    The Station network has ~30k nodes and ~4k unique participant addresses. Each round, we test 1k different tasks and collect ~700k measurements. The tasking algorithm must allow the fraud-detection service to efficiently verify whether a measurement was submitted for a task that’s valid for the given node and round.

Out of scope (for now)

  • How to choose a list of tasks for the current round from all deals that are expected to be retrievable.
  • How to maintain the number of jobs performed per round at a constant rate that’s independent of the network size.

Current status (May 2024)

  • For each round, the Round Tracker service picks 1000 tasks to choose from.
  • Checker nodes pick tasks at random (or at their will) from that list.
  • IPv4 as a scarce resource + rate limiting: the Evaluate service accepts only the first 15 measurements from each IPv4 /24 subnet, ignoring duplicate measurements for the same task.

Available inputs

  • DRAND beacon - 32 bytes (hex encoded as 64 chars) on the default drand chain
  • Participant address in ETH 0x format, not unique; controlled by the checker node operator
  • Station ID - a public key, 44 bytes (hex encoded as 88 chars), unique; controlled by the checker node operator
  • A list of tasks for the current round. A task is defined as (cid, miner_id).

Proposed design

Use an XOR-based distance metric to find the K tasks that are closest to each Station node.

Note: Station IDs are 44 bytes long, while SHA-256 produces 32-byte digests. When you XOR a 44-byte value (Station ID) with a 32-byte value (hash), the first 12 bytes of the result are always the same. I.e. for each nodeKey, the dist values will always start with the same 12 bytes. While this isn’t a problem for picking the K closest tasks, I find it confusing and therefore prefer to use only the last 32 bytes of the Station ID value.

  1. For each task that’s valid for the round, calculate its key by hashing the task together with the randomness for the current Spark round. The way the hash function works, these hash values will be randomly & uniformly distributed over the entire space of 2^256 values. By including the randomness in the inputs, we get a different mapping in each round.
    taskKey = sha256([cid, miner_id, drand].join('\n'))
  2. For each node, use the last 32 bytes (64 hex characters) of the station_id as the key.
    nodeKey = Buffer.from(station_id.slice(-64), 'hex')

    UPDATE
    Let’s hash the Station ID instead; see the Update section below.

    nodeKey = sha256(Buffer.from(station_id))
  3. Find the K tasks closest to the node, using XOR as the distance metric.
    dist = taskKey ^ nodeKey

    This can be implemented using a priority queue in O(T * log(K)) time, where T is the number of tasks for this round and K is the number of tasks each node can perform. Note that T >> K - currently T=1000 and K=15.

IPv4 as a scarce resource

The fraud detection algorithm already rewards only the first 15 measurements from each IPv4 /24 subnet. The tasking algorithm described above is orthogonal and does not require any changes to this mechanism.
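For illustration, the /24-subnet rate limit can be sketched as below. The names `subnetKey` and `filterBySubnet` are hypothetical, and the real Evaluate service may derive the subnet group differently:

```javascript
// Hypothetical: map an IPv4 address to its /24 subnet group.
const subnetKey = (ipv4) => ipv4.split('.').slice(0, 3).join('.') + '.0/24'

// Accept only the first 15 measurements from each /24 subnet.
const MAX_PER_SUBNET = 15

const filterBySubnet = (measurements) => {
  const counts = new Map()
  return measurements.filter(({ ip }) => {
    const key = subnetKey(ip)
    const count = counts.get(key) ?? 0
    if (count >= MAX_PER_SUBNET) return false
    counts.set(key, count + 1)
    return true
  })
}
```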

Verification

The fraud detection component needs to:

  • Calculate T task keys (T invocations of sha256).
  • Assuming N unique nodes, calculate the set of tasks assigned to each node (N invocations of the k-closest-element algorithm on a constant data set).
  • For each measurement, check that it corresponds to a valid task for the node that submitted it.
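This flow can be sketched as follows, assuming a `pickTasksForNode(tasks, stationId, drand, k)` function implementing the k-closest selection from the proposed design (the name is illustrative). Memoizing the task set per Station ID keeps the number of k-closest computations at one per unique node rather than one per measurement:

```javascript
// Returns one boolean per measurement: was it submitted for a task
// that is valid for the submitting node in this round?
const verifyMeasurements = (measurements, tasks, drand, k, pickTasksForNode) => {
  const validTasksByNode = new Map() // station_id -> Set of "cid\nminerId"
  return measurements.map((m) => {
    let valid = validTasksByNode.get(m.stationId)
    if (valid === undefined) {
      // Expensive step, executed once per unique Station ID.
      valid = new Set(
        pickTasksForNode(tasks, m.stationId, drand, k)
          .map((t) => `${t.cid}\n${t.minerId}`))
      validTasksByNode.set(m.stationId, valid)
    }
    return valid.has(`${m.cid}\n${m.minerId}`)
  })
}
```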

Attack vectors

Self-dealing

A malicious actor can pick the tasks they want to check and then choose such a Station ID value that makes the task belong to the k-closest tasks for that Station ID.

We can make this attack less attractive by giving more weight to measurements contributed by Station IDs with longer history. This weight can be applied to rewards but also when aggregating measurements to produce the Retrieval Success Rate and other metrics.

DoS

A malicious actor can provide a unique station ID for each measurement, forcing the fraud detection component to run the k-closest-element algorithm for each measurement.

This remains an open problem.

Performance

When using the heap-based k-closest-element implementation to build the list of K closest tasks for each node, the duration of the evaluation of a single round increases by 50% from 6 seconds to 9 seconds.

I think we can further improve the performance of this part by rewriting the algorithm in Rust & WebAssembly, but I propose to defer such optimisations until later.

Committee sizes

Using the measurements submitted for the round 10788, if we assume that all station instances keep the same inet_group for the entire duration of this round, and submit one measurement for each valid task, we will get the following distribution. The last column shows the current distribution when nodes pick their tasks randomly.

Committees created by deterministic tasking

METRIC            | NEXT GEN | CURRENT
measurements_max  | 1783     | 363
measurements_mean | 916      | 176
measurements_min  | 473      | 29
measurements_p1   | 504      | 37
measurements_p5   | 577      | 42
measurements_p10  | 647      | 44
measurements_p50  | 945      | 158
measurements_p90  | 1115     | 316
measurements_p95  | 1265     | 324
measurements_p99  | 1656     | 342
nodes_max         | 1045     | 322
nodes_mean        | 635      | 162
nodes_min         | 380      | 29
nodes_p1          | 404      | 36
nodes_p5          | 445      | 41
nodes_p10         | 486      | 44
nodes_p50         | 653      | 147
nodes_p90         | 747      | 287
nodes_p95         | 809      | 295
nodes_p99         | 1006     | 310
participants_max  | 296      | 183
participants_mean | 153      | 85
participants_min  | 87       | 11
participants_p1   | 93       | 17
participants_p5   | 100      | 20
participants_p10  | 109      | 22
participants_p50  | 157      | 76
participants_p90  | 185      | 155
participants_p95  | 202      | 161
participants_p99  | 279      | 170
subnets_max       | 654      | 283
subnets_mean      | 440      | 144
subnets_min       | 287      | 28
subnets_p1        | 303      | 35
subnets_p5        | 334      | 40
subnets_p10       | 358      | 42
subnets_p50       | 448      | 135
subnets_p90       | 500      | 252
subnets_p95       | 532      | 258
subnets_p99       | 633      | 269


How to read the tables above

Compare the difference between min/mean/max numbers.

subnets

Before the change, the smallest committee had measurements from only 28 different subnets. After the change, all committees have at least 287 subnets.

Let’s look at the min-p10-p90-max ranges, and also normalise the values so that the mean average is the same for ā€œbeforeā€ and ā€œafterā€ data:

             | min | p10 | p90 | max
After (raw)  | 287 | 358 | 500 | 654
After (norm) |  93 | 117 | 163 | 207
Before       |  28 |  42 | 252 | 283

It’s clear that the new algorithm reduces the variance in the number of subnets per committee.

participants

Before the change, the smallest committee had measurements from only 11 different participants. After the change, all committees have at least 87 participants.

Let’s look at the min-p10-p90-max ranges, and also normalise the values so that the mean average is the same for ā€œbeforeā€ and ā€œafterā€ data:

             | min | p10 | p90 | max
After (raw)  |  87 | 109 | 185 | 296
After (norm) |  48 |  61 | 103 | 164
Before       |  11 |  22 | 155 | 183

It’s clear that the new algorithm reduces the variance in the number of participants per committee.
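For clarity, here is a sketch of the normalisation used in these tables, assuming each ā€œafterā€ value is scaled by the ratio of the two means (e.g. for participants: mean 153 after, mean 85 before, per the big table above):

```javascript
// Scale values so that the "after" distribution has the same mean as
// the "before" distribution, to make the spreads comparable.
const normalise = (values, meanAfter, meanBefore) =>
  values.map((v) => Math.round((v * meanBefore) / meanAfter))

// participants min/p10/p90/max:
normalise([87, 109, 185, 296], 153, 85) // → [48, 61, 103, 164]
```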

measurements

The number of dummy (next-gen) measurements is 2.6x higher than the number of real measurements submitted in the inspected round. When we normalize the percentile values coming from the deterministic tasking algorithm, we get the following table:

             | min | p10 |  p90 |  max
After (raw)  | 473 | 647 | 1115 | 1783
After (norm) |  91 | 124 |  214 |  343
Before       |  29 |  44 |  316 |  363

Again, the new algorithm spreads the measurements more evenly and reduces the number of committees that have very few or too many measurements compared to the average committee.

Conclusion

The proposed design has acceptable performance and significantly improves the quality of committees.

To have enough confidence that a majority of each committee is honest, we need each committee to have at least 40-50 participants. The current algorithm does not meet that requirement for at least 10% of committees. The proposed algorithm seems to meet this requirement in all committees.

Update

I realised that using the Station ID, which is a public key, may not give us a uniform distribution, because there are no guarantees that public keys are uniformly distributed. (Or are there?)

A better solution is to hash the Station ID using the same hashing function we use for building the task key.

Comparison of this new version to the algorithm described above:

                | min | p10 |  p90 |  max
participants v1 |  87 | 109 |  185 |  296
participants v2 |  77 | 111 |  185 |  290
subnets v1      | 287 | 358 |  500 |  654
subnets v2      | 286 | 353 |  498 |  652
measurements v1 | 473 | 647 | 1115 | 1783
measurements v2 | 470 | 657 | 1128 | 1855

Interestingly enough, the more ā€œcorrectā€ algorithm gives us slightly worse committees. Considering the difference is very small, I propose to hash the Station ID to get more confidence in the correctness of our algorithm.